A Hitchhiker’s Guide to Statistical Tests for Assessing Randomized Algorithms in Software Engineering
Abstract
Randomized algorithms have been used to successfully address many different types of software engineering problems. Algorithms of this type entail a significant degree of randomness as part of their logic. They are useful for difficult problems where a precise solution cannot be derived deterministically within reasonable time. However, randomized algorithms can produce different results on every run when applied to the same problem instance. It is hence important to assess their effectiveness by collecting data from a large enough number of runs. The rigorous use of statistical tests is then essential to support the conclusions derived from analyzing such data. In this paper, we provide a systematic review of the use of randomized algorithms in selected software engineering venues in 2009/2010. The goal of the review is not to perform a complete survey but to obtain a representative and up-to-date snapshot of current practice in software engineering research. We show that randomized algorithms are used in a significant percentage of papers but that, in most cases, randomness is not properly accounted for. This casts doubt on the validity of most empirical results assessing randomized algorithms for various applications. There are numerous statistical tests, based on different assumptions, and it is not always clear when and how to use them. We hence provide practical guidelines to support empirical research on randomized algorithms in software engineering.
Keywords: statistical difference, effect size, parametric test, non-parametric test, confidence interval, Bonferroni adjustment, systematic review, survey.
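As an illustration of two keywords above, effect size and Bonferroni adjustment, the following is a minimal sketch and not the paper's own procedure. It assumes the Vargha-Delaney A12 statistic as the effect size measure (the abstract does not name a specific one), and `run_algorithm` is a hypothetical stand-in for one seeded run of a randomized algorithm that returns a single quality score.

```python
import random


def a12(xs, ys):
    """Vargha-Delaney A12 effect size (assumed here as the measure).

    Estimates the probability that a value drawn from xs is larger than
    a value drawn from ys, with ties counted as one half.
    0.5 means no difference between the two samples.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))


def bonferroni(p_values, alpha=0.05):
    """Flag which p-values stay significant after Bonferroni adjustment.

    Each p-value is compared against alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def run_algorithm(seed, bias):
    """Hypothetical stand-in for one run of a randomized algorithm.

    Returns a noisy score; a real study would run the actual algorithm.
    """
    rng = random.Random(seed)
    return rng.gauss(bias, 1.0)


# Collect results from many independent runs, each with its own seed,
# rather than drawing conclusions from a single run.
runs_a = [run_algorithm(seed, bias=0.5) for seed in range(30)]
runs_b = [run_algorithm(seed + 1000, bias=0.0) for seed in range(30)]

effect = a12(runs_a, runs_b)
print(f"A12 effect size of A versus B: {effect:.3f}")

# Illustrative p-values only, as if three pairwise comparisons had been tested.
print(bonferroni([0.01, 0.04, 0.001]))
```

The number of runs (30) and the example p-values are arbitrary choices for the sketch. The p-values themselves would come from a statistical test chosen to match the data's assumptions, for example a non-parametric test when normality cannot be assumed.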
Similar resources
The effect of logbook as a study guide in dentistry training
Introduction: Although the logbook is a useful tool for student learning and assessment, its use in the education of undergraduate dentistry students is not well established. The present study was conducted to assess the effect of the logbook as a study guide and as an effective method for assessing students in the fixed prosthesis course. Methods: This quasi-experimental study was performed...
Wilkinson’s tests and econometric software
The Wilkinson Tests, entry-level tests for assessing the numerical accuracy of statistical computations, have been applied to statistical software packages. Some software developers, having failed these tests, have corrected deficiencies in subsequent versions. Thus these tests have had a meliorative impact on the state of statistical software. These same tests are applied to several econometri...
The Effect of Rural Guide Plan on Objective Quality of Life among Rural Communities in Fariman County
The rural guide plan is the most important tool in the management of rural development in Iran. The final purpose of the plan is to improve quality of life and to provide a safe and attractive living environment in rural areas. The aim of this study is to emphasize the characteristics of rural guide plans, which include: Improve the quality of housing, street network, land use and access to service...
On the Multivariate Rasch Model: Assessing Collaboration in Multiple Choice Tests
We examine the Rasch model for latent structure parameters in binary and multiple response questionnaires and develop methodologies and data-analytic tools for assessing collaboration/cheating in multiple choice tests
Assessing the internal structure of the Technology Acceptance Model in order to present the Persian norm of online health information seeking
Introduction: The present study was conducted by explaining the internal norm of Davis's technology acceptance model in online health information search among Iranian students in order to provide a local model. Methods: The current research is descriptive and was carried out using a survey method. The research community is the students of Ahvaz Jundishapur University of Medical Sciences in all...
Publication date: 2011